A zero-one language L is a regular language whose asymptotic probability converges to either zero
or one. In this case, we say that L obeys the zero-one law. We prove that a regular language obeys the 
zero-one law if and only if its syntactic monoid has a zero element, by means of Eilenberg’s variety 
theoretic approach. Our proof gives an effective automata characterisation of the zero-one law for 
regular languages, and it leads to a linear time algorithm for testing whether a given regular language 
is zero-one. In addition, we discuss the logical aspects of the zero-one law for regular languages. 


1 Introduction 


Let $L$ be a regular language over a non-empty finite alphabet $A$. Recall that the counting function $\gamma_n(L)$ of $L$ counts the number of different words of length $n$ in $L$: $\gamma_n(L) = |L \cap A^n|$, where $A^n$ is the set of all words of length $n$ over $A$. The probability function $\mu_n(L)$ of $L$ is the fraction defined by

$$\mu_n(L) = \frac{\gamma_n(L)}{\gamma_n(A^*)} = \frac{|L \cap A^n|}{|A^n|}.$$

The asymptotic probability $\mu(L)$ of $L$ is defined by $\mu(L) = \lim_{n \to \infty} \mu_n(L)$, if the limit exists. We can regard $\mu_n(L)$ as the probability that a randomly chosen word of length $n$ is in $L$, and $\mu(L)$ as its asymptotic probability. Here we introduce a new class of regular languages which is the main target of this paper.

Definition 1 (zero-one language). A zero-one language $L$ is a regular language whose asymptotic probability $\mu(L)$ is either zero or one. In this case, we say that $L$ obeys the zero-one law. We denote by $\mathcal{ZO}$ the class of all regular zero-one languages.

As we will describe later (see Section 7), the notion of “zero-one law” defined here is a fundamental object in finite model theory.

Example 1. We now consider a few examples.

• The set of all words $A^*$ over $A$ satisfies $\mu(A^*) = 1$, and its complement $\emptyset$ satisfies $\mu(\emptyset) = 0$. These two languages obey the zero-one law.

• Consider $aA^*$, the set of all words which start with the letter $a$ in $A$. Then, for every $n \geq 1$,

$$\mu_n(aA^*) = \frac{|aA^{n-1}|}{|A^n|} = \frac{1}{|A|}.$$

Hence, its limit $\mu(aA^*)$ is $1/|A|$, and $aA^*$ is zero-one if and only if $A$ is unary: $A = \{a\}$.

• Consider $(AA)^*$, the set of all words of even length. Then

$$\mu_n((AA)^*) = \begin{cases} 1 & \text{if } n \text{ is even}, \\ 0 & \text{if } n \text{ is odd}. \end{cases}$$

Hence, its limit $\mu((AA)^*)$ does not exist.

Thus, for some regular languages $L$ the asymptotic probability $\mu(L)$ is either zero or one; for some, like $L = aA^*$ where $|A| \geq 2$, $\mu(L)$ is a real number strictly between zero and one; and for some, like $L = (AA)^*$, it does not even exist. It was previously known that there exists a cubic time algorithm computing $\mu(L)$ for any regular language $L$ ([5]).
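The three behaviours above can be checked numerically. The following is a minimal sketch, not the cubic time algorithm mentioned above: it counts accepted words of length n by dynamic programming over a complete DFA. The function name `mu_n` and the three small automata are illustrative assumptions over the alphabet {a, b}.

```python
from fractions import Fraction

def mu_n(delta, q0, finals, alphabet, n):
    """Return |L ∩ A^n| / |A|^n for the complete DFA (delta, q0, finals)."""
    # counts[q] = number of words of the current length leading from q0 to q
    counts = {q0: 1}
    for _ in range(n):
        nxt = {}
        for q, c in counts.items():
            for a in alphabet:
                r = delta[(q, a)]
                nxt[r] = nxt.get(r, 0) + c
        counts = nxt
    accepted = sum(c for q, c in counts.items() if q in finals)
    return Fraction(accepted, len(alphabet) ** n)

ALPHABET = "ab"
# Hypothetical DFA for aA*: 0 = start, 1 = first letter was a, 2 = first letter was b.
STARTS_WITH_A = {(0, "a"): 1, (0, "b"): 2,
                 (1, "a"): 1, (1, "b"): 1,
                 (2, "a"): 2, (2, "b"): 2}
# Hypothetical DFA for (AA)*: the state is the parity of the length read so far.
EVEN_LENGTH = {(0, "a"): 1, (0, "b"): 1, (1, "a"): 0, (1, "b"): 0}
# Hypothetical DFA for A*aA*: 0 = no a seen yet, 1 = some a seen.
CONTAINS_A = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 1}

if __name__ == "__main__":
    for n in range(1, 7):
        print(n,
              mu_n(STARTS_WITH_A, 0, {1}, ALPHABET, n),
              mu_n(EVEN_LENGTH, 0, {0}, ALPHABET, n),
              mu_n(CONTAINS_A, 0, {1}, ALPHABET, n))
```

The first column stays constant, the second oscillates, and the third approaches one, matching the three cases described in the text.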


Our results and contributions. In this paper, we show that the following class of languages exactly 
captures the zero-one law for regular languages. 

Definition 2 ([?]). A language with zero is a regular language whose syntactic monoid has a zero element. We denote by $\mathcal{Z}$ the class of all regular languages with zero.

More precisely, we prove the following theorem, which states that $\mathcal{ZO}$ and $\mathcal{Z}$ are equivalent by means of a transparent condition on their automata: zero automata (Section 3) and quasi-zero automata (Section 6), which will be described later. The remarkable fact is that $\mathcal{ZO} = \mathcal{Z}$ holds even though these two notions seem completely different from each other: $\mathcal{ZO}$ is defined by the asymptotic behaviour of the probability, whereas $\mathcal{Z}$ is defined by the existence of a zero of the syntactic monoid.

Theorem 1. Let $L$ be a regular language and $\mathcal{A}_L$ be the minimal automaton of $L$. Then the following four conditions are equivalent.

(1) $\mathcal{A}_L$ is zero.

(2) $L$ is with zero.

(3) $L$ obeys the zero-one law.

(4) $L$ is recognised by a quasi-zero automaton.

We will prove this theorem as a cyclic chain of implications (1) ⇒ (2) ⇒ (3) ⇒ (1), and (1) ⇔ (4) independently. We should notice that the most difficult part of this proof is the implication (3) ⇒ (1), while the former part (1) ⇒ (2) ⇒ (3) is easy. The key points of the proof of this part are closure properties of $\mathcal{ZO}$ and Lemma 1, which comes from Eilenberg’s variety theorem. The automata characterisation (4) of Theorem 1 leads to a linear time algorithm for testing whether a given regular language is zero-one. In addition, our automata theoretic proof sheds new light on the relation between the zero-one law for regular languages and logical fragments over finite words.


Paper outline. The remainder of this paper is organised as follows. In Section 2 we first give the necessary definitions and terminology for languages, monoids, and automata. Lemma 1 will be introduced in this section. For the sake of completeness we include the proof of Lemma 1. Section 3 provides a detailed exposition of the notion of zero automata. Our automata theoretic proof of Theorem 1 consists of three parts: (i) Check certain closure properties of $\mathcal{ZO}$ (Section 4), (ii) Apply Lemma 1 to prove the implication (3) ⇒ (1) (Section 5), (iii) Generalise the notion of zero automata, and prove (1) ⇔ (4) (Section 6). In Section 6 we will also give a linear time algorithm (Theorem 2). The logical aspects of our results are investigated in Section 7. Finally, we discuss some related works and conclude this paper in the final section. We try to keep all sections as self-contained as possible.


An Automata Theoretic Approach to the Zero-One Law for Regular Languages 


2 Preliminaries 


In this paper, all considered automata are deterministic, finite, complete and accessible. We refer the reader to the book by Sakarovitch [18] for background material.


Languages and monoids. We denote by $A^*$ [$A^n$] the set of all words [of length $n$] over a nonempty finite alphabet $A$, and by $|w|$ the length of a word $w$ in $A^*$. The empty word is denoted by $\varepsilon$. That is, $A^*$ is the free monoid over $A$ with the neutral element $\varepsilon$. We can easily verify that

$$\mu_{n+k}(A^k L) = \frac{|A^k L \cap A^{n+k}|}{|A^{n+k}|} = \frac{|A^k (L \cap A^n)|}{|A^k A^n|} = \frac{|L \cap A^n|}{|A^n|} = \mu_n(L)$$

holds for any language $L$ of $A^*$ and $k \geq 0$. It follows that $\mu(A^k L)$ exists if and only if $\mu(L)$ exists, and in that case they are equal: $\mu(A^k L) = \mu(L)$. If two languages $L$ and $K$ of $A^*$ are mutually disjoint ($L \cap K = \emptyset$), then clearly $\mu(L \cup K) = \mu(L) + \mu(K)$ holds if both $\mu(L)$ and $\mu(K)$ exist. We say that $v$ is a factor of $w$ if there exist $x, y$ in $A^*$ such that $w = xvy$. Let $L$ be a language of $A^*$ and let $u$ be a word of $A^*$. The left [right] quotient $u^{-1}L$ [$Lu^{-1}$] of $L$ by $u$ is defined by

$$u^{-1}L = \{v \in A^* \mid uv \in L\} \quad \text{and} \quad Lu^{-1} = \{v \in A^* \mid vu \in L\}.$$

We denote by $\overline{L} = A^* \setminus L$ the complement of $L$. The syntactic congruence of $L$ of $A^*$ is the relation $\sim_L$ defined on $A^*$ by $u \sim_L v$ if and only if $xuy \in L \Leftrightarrow xvy \in L$ holds for all $x, y$ in $A^*$. The quotient $A^*/{\sim_L}$ is called the syntactic monoid of $L$ and the natural morphism $\phi_L : A^* \to A^*/{\sim_L}$ is called the syntactic morphism of $L$. If $M$ is a monoid, an element $0$ in $M$ is said to be a zero if $0m = m0 = 0$ holds for all $m$ in $M$.


Automata and an important lemma. A (complete deterministic finite) automaton over a finite alphabet $A$ is a quintuple $\mathcal{A} = (Q, A, \cdot, q_0, F)$ where

• $Q$ is a finite set of states;

• $\cdot : Q \times A \to Q$ is a transition function, which can be extended to a mapping $\cdot : Q \times A^* \to Q$ by $q \cdot \varepsilon = q$ and $q \cdot aw = (q \cdot a) \cdot w$ where $q \in Q$, $a \in A$ and $w \in A^*$;

• $q_0 \in Q$ is an initial state, and $F \subseteq Q$ is a set of final states.

The language recognised by $\mathcal{A}$ is denoted by $L(\mathcal{A}) = \{w \in A^* \mid q_0 \cdot w \in F\}$. We say that $\mathcal{A}$ recognises $L$ if $L = L(\mathcal{A})$. It is a basic fact that, for any regular language $L$, there exists a unique automaton recognising $L$ which has the minimum number of states: the minimal automaton of $L$, which we denote by $\mathcal{A}_L$. Each word $w$ in $A^*$ defines the transformation $w : q \mapsto q \cdot w$ on $Q$. The transition monoid of $\mathcal{A}$ is the transformation monoid generated by the transformations of the letters in $A$. It is well known that the syntactic monoid of a regular language is equal to the transition monoid of its minimal automaton.
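To make the link between automata and monoids concrete, the sketch below enumerates the transition monoid of a small DFA and searches it for a zero element. It is a brute-force illustration under stated assumptions: states are numbered 0..n-1, the two example automata are hypothetical (both happen to be minimal), and the monoid can be exponentially large in general, so this is not an efficient test.

```python
def transition_monoid(n_states, alphabet, delta):
    """All transformations of {0..n-1} induced by words, as tuples of images."""
    identity = tuple(range(n_states))
    gens = [tuple(delta[(q, a)] for q in range(n_states)) for a in alphabet]
    seen = {identity}
    frontier = [identity]
    while frontier:
        t = frontier.pop()
        for g in gens:
            # transformation of "word of t followed by one more letter"
            s = tuple(g[t[q]] for q in range(n_states))
            if s not in seen:
                seen.add(s)
                frontier.append(s)
    return seen

def compose(s, t):
    """Apply s first, then t."""
    return tuple(t[s[q]] for q in range(len(s)))

def find_zero(monoid):
    """Return an element z with z*m == m*z == z for all m, or None."""
    for z in monoid:
        if all(compose(z, m) == z and compose(m, z) == z for m in monoid):
            return z
    return None

# Hypothetical minimal DFA for A*aA* over {a, b}: 0 = no a seen, 1 = a seen.
CONTAINS_A = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 1}
# Hypothetical minimal DFA for (AA)*: the state is the length parity.
EVEN_LENGTH = {(0, "a"): 1, (0, "b"): 1, (1, "a"): 0, (1, "b"): 0}

if __name__ == "__main__":
    print(find_zero(transition_monoid(2, "ab", CONTAINS_A)))
    print(find_zero(transition_monoid(2, "ab", EVEN_LENGTH)))
```

For the first automaton the zero is the constant map onto state 1; for the second the transition monoid is a two-element group, which has no zero.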

For any subset $P$ of $Q$, the past of $P$ is the language denoted by $\mathrm{Past}(P)$ and defined by

$$\mathrm{Past}(P) = \{w \in A^* \mid q_0 \cdot w \in P\}.$$

Dually, the future of a subset $P$ of $Q$ is the language denoted by $\mathrm{Fut}(P)$ and defined by

$$\mathrm{Fut}(P) = \{w \in A^* \mid \exists p \in P,\ p \cdot w \in F\}.$$

It is well known that an (accessible) automaton $\mathcal{A}$ is minimal if and only if the following condition

$$p = q \Leftrightarrow \mathrm{Fut}(p) = \mathrm{Fut}(q) \qquad (\mathrm{M})$$

holds for every pair of states $p, q$ in $Q$. The Myhill-Nerode theorem states that every regular language has only a finite number of left and right quotients.

In Section 5, to prove Theorem 1, we will use the following technical but important lemma. For the sake of completeness we include the proof, which is essentially based on “Proof of Theorem 3.2 and 3.2s” in the book [7] by Eilenberg.

Lemma 1. Let $\mathcal{A}_L = (Q, A, \cdot, q_0, F)$ be the minimal automaton of a language $L$. Then for any subset $P$ of $Q$, its past $\mathrm{Past}(P)$ can be expressed as a finite Boolean combination of languages of the form $Lw^{-1}$.

Proof. We only have to prove that, for any state $q$ in $Q$, its past $\mathrm{Past}(q)$ can be expressed as a Boolean combination of languages of the form $Lw^{-1}$, because $\mathrm{Past}(P)$ is the finite union of the languages $\mathrm{Past}(q)$ with $q$ in $P$. Our goal is to prove the following equation, with the usual conventions that an empty intersection equals $A^*$ and an empty union equals $\emptyset$:

$$\mathrm{Past}(q) = \Bigl(\bigcap_{w \in \mathrm{Fut}(q)} Lw^{-1}\Bigr) \setminus \Bigl(\bigcup_{w \notin \mathrm{Fut}(q)} Lw^{-1}\Bigr). \qquad (1)$$

The finiteness of this Boolean combination follows from the Myhill-Nerode theorem.

We first prove that the left hand side is contained in the right hand side in Equation (1). Let $v$ be a word in $\mathrm{Past}(q)$. If a word $w$ is in $\mathrm{Fut}(q)$, then $vw$ is in $L$ by definition, and hence $v$ is in $Lw^{-1}$. If a word $w$ is not in $\mathrm{Fut}(q)$, then $vw$ is not in $L$ by definition, and hence $v$ is not in $Lw^{-1}$. It follows that the left hand side is contained in the right hand side in Equation (1).

Then we prove that the right hand side is contained in the left hand side in Equation (1). Let $v$ be a word in the right hand side of Equation (1). Let $p$ be the state satisfying $q_0 \cdot v = p$, that is, $v$ is a word in $\mathrm{Past}(p)$. For any $w$ in $\mathrm{Fut}(q)$, by the form of Equation (1), $v$ is in $Lw^{-1}$, from which we get $vw$ in $L$, whence $p \cdot w$ is in $F$. That is, $w$ also belongs to $\mathrm{Fut}(p)$. Conversely, for any $w$ not in $\mathrm{Fut}(q)$, $v$ is not in $Lw^{-1}$ and thus $vw$ is not in $L$. That is, $w$ does not belong to $\mathrm{Fut}(p)$. It follows that $p$ and $q$ have the same future $\mathrm{Fut}(p) = \mathrm{Fut}(q)$, from which we get $p = q$ by Condition (M) of the minimality of $\mathcal{A}_L$. Hence we obtain $v$ in $\mathrm{Past}(q)$, and thus the right hand side is contained in the left hand side in Equation (1). □

Remark 1. A variety of languages is a class of regular languages closed under Boolean operations, left and right quotients and inverses of morphisms. The algebraic counterpart of a variety is a (pseudo)variety of finite monoids: a class of finite monoids closed under taking submonoids, quotients and finite direct products (cf. [?]). Eilenberg’s variety theorem [7] states that varieties of languages are in one-to-one correspondence with varieties of finite monoids. Lemma 1 shows us the importance of the Boolean operations taken in tandem with quotients. While this lemma, which is an “automaton version” of a key lemma in Eilenberg’s variety theorem, is known (cf. [8]), we have not found any literature that includes a complete proof.


3 Zero automata 


In this section, we introduce the zero automaton, which plays a major role in our work. In contrast to the class of monoids with zero, their natural counterpart, the class of zero automata has not been given much attention. To the best of our knowledge, only a few studies (e.g., [11]) have investigated zero automata, in the context of the theory of synchronising words for Černý’s conjecture.

Figure 1: Zero and non-zero automata

Let $\mathcal{A}$ be an automaton $(Q, A, \cdot, q_0, F)$. For each pair of states $p, q$ in $Q$, we say that $q$ is reachable from $p$ if there exists a word $w$ such that $p \cdot w = q$. $\mathcal{A}$ is called accessible if every state $q$ in $Q$ is reachable from the initial state $q_0$. A subset $P$ of $Q$ is called a strongly connected component if, for each state $q$ in $P$, $q$ is reachable from every other state in $P$. A state $q$ in $Q$ is said to be sink if $q \cdot a = q$ holds for every letter $a$ in $A$. We say that a subset $P$ of $Q$ is sink, analogously, if there is no transition from any state $p$ in $P$ to a state which is not in $P$. That is, the states of $Q \setminus P$ are not reachable from $P$. Note that every (complete) automaton has at least one strongly connected sink component. The family of all strongly connected sink components of $\mathcal{A}$ is denoted by $\mathrm{Sink}(\mathcal{A})$. A strongly connected component $P$ is trivial if it consists of a single state $P = \{p\}$. We shall identify a singleton $\{p\}$ with its unique element $p$. A word $w$ is a synchronising word of $\mathcal{A}$ if there exists a state $q$ in $Q$ such that $p \cdot w = q$ holds for every state $p$ in $Q$. That is, $w$ is the constant map from $Q$ to $q$. We call an automaton synchronising if it has a synchronising word. Note that any synchronising automaton has at most one sink state. As we will prove in Section 5, the following class of automata captures precisely the zero-one law for regular languages.

Definition 3 ([11]). A zero automaton is a synchronising automaton with a sink state.

Example 2. Consider the two automata $\mathcal{A}_0$ and $\mathcal{A}_1$ illustrated in Figure 1. $\mathcal{A}_0$ is a zero automaton but $\mathcal{A}_1$ is not, though both automata have a sink state $q_5$. The only difference between $\mathcal{A}_0$ and $\mathcal{A}_1$ is the result of the transition $q_4 \cdot a$, which equals $q_5$ in $\mathcal{A}_0$ but stays inside $\{q_3, q_4\}$ in $\mathcal{A}_1$. We can easily verify that $\mathcal{A}_0$ has a unique strongly connected sink component $q_5$, while $\mathcal{A}_1$ has two strongly connected sink components $\{q_3, q_4\}$ and $q_5$.

Definition 3 can be rephrased as follows.

Lemma 2. Let $\mathcal{A} = (Q, A, \cdot, q_0, F)$ be an automaton. Then $\mathcal{A}$ is zero if and only if $\mathcal{A}$ has a unique strongly connected sink component and it is trivial, i.e., $\mathrm{Sink}(\mathcal{A}) = \{\{p\}\}$ for a certain sink state $p$.

Proof. First we assume $\mathcal{A}$ is zero with a sink state $p$. Then there exists a synchronising word $w$, and it clearly satisfies $q \cdot w = p$ for each $q$ in $Q$ since $p$ is sink. This shows that there is no strongly connected sink component in $Q \setminus \{p\}$.

Now we prove the converse direction. We assume $\mathcal{A}$ has a unique strongly connected sink component and that it is trivial, say $p$. We can verify that for every state $q$ in $Q$ there exists a word $w$ in $A^*$ such that $q \cdot w = p$. Indeed, if there is no such word $w$ for some $q$, then the set of all states reachable from $q$, $\{r \in Q \mid \exists w \in A^*,\ q \cdot w = r\}$, must contain at least one strongly connected sink component which does not contain $p$. This contradicts the uniqueness of the strongly connected sink component $p$ in $\mathcal{A}$. The existence of a synchronising word is guaranteed, because we can concretely construct it as follows. Let $n = |Q|$ be the number of states and let $Q = \{q_0, \ldots, q_{n-1} = p\}$. We define a word sequence $w_i$ inductively by $w_0 = u_{q_0}$ and $w_i = u_{(q_i \cdot v_{i-1})}$, where each $u_q$ is a shortest word satisfying $q \cdot u_q = p$, and $v_{i-1}$ is the word $w_0 \cdots w_{i-1}$. As shown in Figure 2, we can easily verify that the word $v_{n-1} = w_0 \cdots w_{n-1}$ is a synchronising word satisfying $q \cdot v_{n-1} = p$ for each $q$ in $Q$.

Figure 2: The synchronising word $v_{n-1} = w_0 \cdots w_{n-1}$ in the proof of Lemma 2

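For example, consider the zero automaton $\mathcal{A}_0$ in Figure 1. Then the words $u_{q_i}$, $w_i$ and $v_i$ are defined as follows.

$$\begin{array}{c|ccc}
 & u_{q_i} & w_i & v_i \\ \hline
q_0 & aab & aab & aab \\
q_1 & ab & b & aabb \\
q_2 & b & \varepsilon & aabb \\
q_3 & aa & \varepsilon & aabb \\
q_4 & a & \varepsilon & aabb \\
q_5 & \varepsilon & \varepsilon & aabb
\end{array}$$

The obtained word $v_5 = aabb$ is a synchronising word which satisfies $q_i \cdot aabb = q_5$ for all $q_i$ in $Q$. It is clear that the non-zero automaton $\mathcal{A}_1$ in Figure 1 does not have a synchronising word, since it has two strongly connected sink components. □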
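The construction in the proof of Lemma 2 can be sketched in a few lines. The code below is an illustration under assumptions: the three-state automaton `DELTA` is hypothetical (it is not the automaton of Figure 1, whose transitions are not listed in the text), and the helper names are invented for this example.

```python
from collections import deque

def run(delta, q, word):
    """Return the state q . word."""
    for a in word:
        q = delta[(q, a)]
    return q

def shortest_word_to(delta, alphabet, q, target):
    """Breadth-first search for a shortest u with q . u == target (None if none)."""
    seen = {q}
    queue = deque([(q, "")])
    while queue:
        state, word = queue.popleft()
        if state == target:
            return word
        for a in alphabet:
            r = delta[(state, a)]
            if r not in seen:
                seen.add(r)
                queue.append((r, word + a))
    return None

def synchronising_word(delta, alphabet, states, sink):
    """Build v = w_0 w_1 ... with w_i a shortest word from q_i . v_{i-1} to the sink."""
    v = ""
    for q in states:
        u = shortest_word_to(delta, alphabet, run(delta, q, v), sink)
        if u is None:
            raise ValueError("the sink is not reachable from every state")
        v += u  # states handled earlier already sit in the sink and stay there
    return v

# Hypothetical zero automaton over {a, b} with sink state 2.
DELTA = {(0, "a"): 1, (0, "b"): 0,
         (1, "a"): 0, (1, "b"): 2,
         (2, "a"): 2, (2, "b"): 2}

if __name__ == "__main__":
    v = synchronising_word(DELTA, "ab", [0, 1, 2], 2)
    print(v, [run(DELTA, q, v) for q in (0, 1, 2)])
```

The word produced this way is not necessarily a shortest synchronising word; the construction only guarantees that one exists, with length at most $n(n-1)$ since each $w_i$ has length below $n$.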


4 Closure properties of $\mathcal{ZO}$

We first introduce the following lemma.

Lemma 3. Let $L$ be a language of $A^*$ and $w$ be a word in $A^k$. Then the asymptotic probability of $L$ exists if and only if the asymptotic probability of the language $wL$ [$Lw$] exists. Moreover, these limits satisfy the equation $\mu(wL) = \mu(Lw) = |A|^{-k}\mu(L)$.

Proof. Since $wL$ and $Lw$ clearly have the same counting function, we only have to prove the case of $wL$. For every $u, v$ in $A^k$ such that $u \neq v$, the languages $uL$ and $vL$ are obviously mutually disjoint and their counting functions satisfy

$$\gamma_n(uL) = \gamma_n(vL) = \begin{cases} 0 & n < k, \\ \gamma_{n-k}(L) & n \geq k. \end{cases}$$

This shows that $uL$ and $vL$ have the same counting function and thus have the same asymptotic probability if it exists. We can easily verify that

$$\mu(L) = \mu(A^k L) = \sum_{u \in A^k} \mu(uL) = |A|^k \mu(wL)$$

holds for any $w$ in $A^k$.

□
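As a quick sanity check of Lemma 3, here is a worked instance with illustrative choices that do not come from the text: $A = \{a, b\}$, $L = A^*$ and $w = ab$, so that $k = 2$.

```latex
% Illustrative choice (an assumption, not from the text): A = {a,b}, L = A^*, w = ab, k = 2.
\[
  \gamma_n(abA^*) =
  \begin{cases}
    0       & n < 2,\\
    2^{n-2} & n \ge 2,
  \end{cases}
  \qquad
  \mu_n(abA^*) = \frac{2^{n-2}}{2^{n}} = \frac14 \quad (n \ge 2),
\]
\[
  \mu(abA^*) = \frac14 = |A|^{-2}\,\mu(A^*).
\]
```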


Now we prove the following proposition, which states the closure properties of the class $\mathcal{ZO}$ that are necessary for Lemma 1.

Proposition 1. $\mathcal{ZO}$ is closed under Boolean operations, left and right quotients.

Proof of Proposition 1. We first prove that $\mathcal{ZO}$ is closed under Boolean operations, and then prove that $\mathcal{ZO}$ is closed under quotients.

$\mathcal{ZO}$ is closed under Boolean operations. Let $L, K$ be two languages in $\mathcal{ZO}$. It is obvious that $\mathcal{ZO}$ is closed under complement since $\mu(\overline{L}) = 1 - \mu(L) \in \{0, 1\}$, and we can easily verify that the following equations hold.

• $\mu(L \cup K) = 0$ if $\mu(L) = 0$ and $\mu(K) = 0$;

• $\mu(L \cap K) = 0$ if either $\mu(L) = 0$ or $\mu(K) = 0$;

• $\mu(L \cup K) = 1$ if either $\mu(L) = 1$ or $\mu(K) = 1$;

• $\mu(L \cap K) = 1$ if $\mu(L) = 1$ and $\mu(K) = 1$.


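$\mathcal{ZO}$ is closed under quotients. We first prove that $\mathcal{ZO}$ is closed under left quotients. Let $L$ be a regular language in $\mathcal{ZO}$, and assume without loss of generality that $L$ does not contain $\varepsilon$. First we assume $\mu(L) = 0$. By the definition of left quotients, one can easily verify that

$$L = \bigcup_{a \in A} (L \cap aA^*) = \bigcup_{a \in A} a\,a^{-1}L$$

holds (since $\varepsilon \notin L$), and all these sets $a\,a^{-1}L$ ($= L \cap aA^*$) are mutually disjoint. It follows that the following equation holds.

$$\mu(L) = \lim_{n \to \infty} \frac{\bigl|\bigl(\bigcup_{a \in A} a\,a^{-1}L\bigr) \cap A^n\bigr|}{|A^n|} = \lim_{n \to \infty} \frac{\bigl|\bigcup_{a \in A} (a\,a^{-1}L \cap A^n)\bigr|}{|A^n|} = \lim_{n \to \infty} \sum_{a \in A} \frac{|a\,a^{-1}L \cap A^n|}{|A^n|} = \sum_{a \in A} \mu(a\,a^{-1}L) = 0.$$

That is, the asymptotic probability $\mu(a\,a^{-1}L)$ equals zero for each $a$ in $A$, since this sum of non-negative terms converges to zero. In addition, $\mu(a\,a^{-1}L)$ coincides with $\mu(a^{-1}L)$ for any $a$ in $A$, because $\mu(a\,a^{-1}L) = |A|^{-1}\mu(a^{-1}L) = 0$ by Lemma 3, whence $\mu(a^{-1}L) = 0$.

Next we assume $\mu(L) = 1$. Then $\mu(\overline{L}) = 0$ and

$$\overline{a^{-1}L} = \overline{\{w \in A^* \mid aw \in L\}} = \{w \in A^* \mid aw \in \overline{L}\} = a^{-1}\overline{L}$$

holds. We therefore obtain:

$$\mu(a^{-1}L) = 1 - \mu(\overline{a^{-1}L}) = 1 - \mu(a^{-1}\overline{L}) = 1 - 0 = 1.$$

Quotients by longer words follow by induction on the length, since $(ua)^{-1}L = a^{-1}(u^{-1}L)$. We can prove that $\mathcal{ZO}$ is closed under right quotients in the same manner.

□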
5 Equivalence of $\mathcal{ZO}$ and $\mathcal{Z}$

We will use the following lemma, which is a direct consequence of Lemma 1 and Proposition 1.

Lemma 4. Let $L$ be a regular language in $\mathcal{ZO}$ and $\mathcal{A}_L = (Q, A, \cdot, q_0, F)$ be its minimal automaton. Then, for any subset $P$ of $Q$, its past $\mathrm{Past}(P)$ is also in $\mathcal{ZO}$.

Proof. By Lemma 1, for any subset $P$ of $Q$, its past $\mathrm{Past}(P)$ can be expressed as a finite Boolean combination of languages of the form $Lw^{-1}$. It follows that $\mathrm{Past}(P)$ obeys the zero-one law, since $L$ is in $\mathcal{ZO}$ and $\mathcal{ZO}$ is closed under Boolean operations and quotients by Proposition 1. □

Lemma 4 will be used for proving the direction (3) ⇒ (1). Now we give a proof.

Proof of Theorem 1. We show the implications (1) ⇒ (2) ⇒ (3) ⇒ (1). The former implications (1) ⇒ (2) ⇒ (3) are easy and almost folklore, but we include a proof here to be self-contained.

(1) ⇒ (2) ($\mathcal{A}_L$ is zero ⇒ $L$ is with zero). Let $\mathcal{A}_L = (Q, A, \cdot, q_0, F)$ be the minimal automaton of $L$, and assume it is zero with a sink state $p$. Let $M$ be the transition monoid of $\mathcal{A}_L$ and $\phi : A^* \to M$ be the syntactic morphism of $L$. Then we can verify that $M$ has a zero element $0$, namely the transformation $0 : q \mapsto p$ for all $q$ in $Q$, that is, $0$ is the constant map from $Q$ to $p$. The existence of $0$ is guaranteed since $\mathcal{A}_L$ is synchronising. Indeed, for any synchronising word $w$, $\phi(w) = 0$ holds. One can easily verify that $m0 = 0m = 0$ for all $m$ in $M$, because $p$ is a sink state. This proves that the syntactic monoid $M$ of $L$ has a zero.

(2) ⇒ (3) ($L$ is with zero ⇒ $L$ obeys the zero-one law). Let $L$ be a regular language in $\mathcal{Z}$, $M$ be its syntactic monoid with a zero element $0$ and $\phi : A^* \to M$ be its syntactic morphism. We choose a word $w_0$ from the preimage of $0$: $w_0 \in \phi^{-1}(0)$.

Now we prove $\mu(L) = 1$ if $w_0$ is in $L$. By the definition of zero, we have

$$\phi(x w_0 y) = \phi(x)\phi(w_0)\phi(y) = \phi(x)\,0\,\phi(y) = 0$$

for any words $x, y$ in $A^*$. That is, if $w$ contains $w_0$ as a factor, then $\phi(w) = \phi(w_0) = 0$ holds and hence $w$ is also in $L$. Let $L_{w_0} = A^* w_0 A^*$ be the set of all words that contain $w_0$ as a factor. Then clearly $L_{w_0}$ is contained in $L$, from which we get $\mu_n(L_{w_0}) \leq \mu_n(L)$ for all $n$. The probability $\mu_n(L_{w_0})$ is nothing but the probability that a randomly chosen word of length $n$ contains $w_0$ as a factor. The following well known elementary fact, sometimes called Borges’s theorem (cf. Note I.35 in [10]), ensures that $\mu_n(L_{w_0})$ tends to one as $n$ tends to infinity. This shows $\mu(L) = \mu(L_{w_0}) = 1$, and we can prove $\mu(L) = 0$ if $w_0$ is not in $L$ in the same manner.

Borges’s theorem. Take any fixed finite set $\Pi$ of words in $A^*$. A random word in $A^*$ of length $n$ contains all the words of the set $\Pi$ as factors with probability tending to one exponentially fast as $n$ tends to infinity.
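For a single word, the convergence stated by Borges’s theorem can be observed exactly. The sketch below is an illustration, not part of the proof: it builds the usual pattern-matching automaton for one word and counts the words of length n that contain it. The function names and the choice of patterns are assumptions made for this example.

```python
from fractions import Fraction

def matching_automaton(pattern, alphabet):
    """DFA whose state is the length of the longest prefix of `pattern`
    that is a suffix of the input read so far; state len(pattern) is absorbing."""
    m = len(pattern)
    delta = {}
    for state in range(m + 1):
        for a in alphabet:
            if state == m:
                delta[(state, a)] = m  # the pattern has already occurred
                continue
            s = pattern[:state] + a
            k = len(s)
            # longest suffix of s that is also a prefix of the pattern
            while k > 0 and s[-k:] != pattern[:k]:
                k -= 1
            delta[(state, a)] = k
    return delta

def containment_probability(pattern, alphabet, n):
    """Exact probability that a uniform random word of length n contains `pattern`."""
    m = len(pattern)
    delta = matching_automaton(pattern, alphabet)
    counts = [0] * (m + 1)
    counts[0] = 1
    for _ in range(n):
        nxt = [0] * (m + 1)
        for state, c in enumerate(counts):
            if c:
                for a in alphabet:
                    nxt[delta[(state, a)]] += c
        counts = nxt
    return Fraction(counts[m], len(alphabet) ** n)

if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(n, float(containment_probability("aab", "ab", n)))
```

The printed values increase towards one as n grows, which is the behaviour used for the language $A^* w_0 A^*$ in the argument above.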

(3) ⇒ (1) ($L$ obeys the zero-one law ⇒ $\mathcal{A}_L$ is zero). Let $L$ be a regular language in $\mathcal{ZO}$ and $\mathcal{A}_L = (Q, A, \cdot, q_0, F)$ be its minimal automaton, and let $\mathrm{Sink}(\mathcal{A}_L) = \{P_1, \ldots, P_k\}$ for some $k > 0$. Our goal is to prove $k = 1$ and $\mathrm{Sink}(\mathcal{A}_L) = \{\{p\}\}$ for a certain sink state $p$. It then follows that $\mathcal{A}_L$ is zero by Lemma 2.

For any strongly connected sink component $P_i$, there exists a word $w_i$ such that $q_0 \cdot w_i$ is in $P_i$, because $\mathcal{A}_L$ is accessible. Since $P_i$ is sink, the language $w_i A^*$ is contained in $\mathrm{Past}(P_i)$, from which we get

$$0 < \mu(w_i A^*) = |A|^{-|w_i|}\mu(A^*) = |A|^{-|w_i|} \leq \mu(\mathrm{Past}(P_i)) \qquad (2)$$

for each $P_i$ by Lemma 3. Lemma 4 and Equation (2) imply that the asymptotic probability $\mu(\mathrm{Past}(P_i))$ surely exists and satisfies

$$\mu(\mathrm{Past}(P_i)) = 1 \qquad (3)$$

for every strongly connected sink component $P_i$.

Now we prove $k = 1$. By Equation (3), we can easily verify that

$$\mu\Bigl(\bigcup_{i=1}^{k} \mathrm{Past}(P_i)\Bigr) = \sum_{i=1}^{k} \mu(\mathrm{Past}(P_i)) = k$$

holds, because $\mathcal{A}_L$ is deterministic and thus all $\mathrm{Past}(P_i)$ are mutually disjoint. Since an asymptotic probability is at most one, this clearly shows $k = 1$, that is, there exists a unique strongly connected sink component, say $P$, in $\mathrm{Sink}(\mathcal{A}_L) = \{P\}$.

Next we let $P = \{p_1, \ldots, p_n\}$ and prove $n = 1$. Since $P$ satisfies $\mu(\mathrm{Past}(P)) = 1$ by Equation (3), there exists exactly one state $p$ in $P$ satisfying $\mu(\mathrm{Past}(p)) = 1$ by Lemma 4. Further, because $P$ is strongly connected, for every state $p_i$ in $P$ there exists a word $w_i$ such that $p \cdot w_i = p_i$. It follows that $\mathrm{Past}(p)w_i \subseteq \mathrm{Past}(p_i)$ and thus

$$0 < \mu(\mathrm{Past}(p)w_i) = |A|^{-|w_i|}\mu(\mathrm{Past}(p)) = |A|^{-|w_i|} \leq \mu(\mathrm{Past}(p_i)) = 1 \qquad (4)$$

holds for every state $p_i$ in $P$ by Lemma 3 and Lemma 4. Equations (3) and (4) imply

$$\mu(\mathrm{Past}(P)) = \sum_{i=1}^{n} \mu(\mathrm{Past}(p_i)) = \sum_{i=1}^{n} 1 = n = 1,$$

because $\mathcal{A}_L$ is deterministic and thus all $\mathrm{Past}(p_i)$ are mutually disjoint. We now obtain $n = 1$, that is, $P$ is a singleton and hence $\mathrm{Sink}(\mathcal{A}_L) = \{\{p\}\}$. That is, $\mathcal{A}_L$ is zero. □

Remark 2. It is interesting that, though we use Borges’s theorem to prove the direction (2) ⇒ (3), Theorem 1 is a vast generalisation of Borges’s theorem, since any language of the form $A^*KA^*$ where $K$ is regular is always recognised by a zero automaton (but the converse is not true). To state Theorem 1 more precisely, by the proof above we can easily verify that a zero-one language $L$ satisfies $\mu(L) = 1$ [$\mu(L) = 0$] if and only if its minimal automaton $\mathcal{A}_L$ is zero and the sink state of $\mathcal{A}_L$ is final [non-final].

6 Linear time algorithm for testing the zero-one law 

The equivalence of zero automata and the zero-one law gives us an effective algorithm. For a given $n$-state automaton $\mathcal{A}$, we can determine whether $L(\mathcal{A})$ obeys the zero-one law by the following steps: (i) Minimise $\mathcal{A}$ to obtain its minimal automaton $\mathcal{A}_L$. (ii) Calculate the family $\mathcal{P}$ of all strongly connected components of $\mathcal{A}_L$. (iii) Check whether $\mathcal{P}$ contains exactly one strongly connected sink component and it is trivial, i.e., whether $\mathcal{A}_L$ is a zero automaton (Lemma 2). It is well known that Hopcroft’s automaton minimisation algorithm has an $O(n \log n)$ time complexity and that Tarjan’s strongly connected components algorithm has an $O(n + n|A|) = O(n)$ time complexity, where $n|A|$ is the number of edges and the alphabet size $|A|$ is treated as a constant. Hence we can minimise $\mathcal{A}$ to obtain $\mathcal{A}_L$ in $O(n \log n)$ in step (i), and can calculate $\mathcal{P}$ in $O(n)$ in step (ii). One can easily verify that step (iii) can be done in $O(n)$. To sum up, we have an $O(n \log n)$ algorithm for testing whether a given regular language obeys the zero-one law, if it is given by an $n$-state deterministic finite automaton. We can obtain, however, a more efficient algorithm by avoiding minimisation. In order to do that, there is a need for further investigation of the structure of zero automata.


Quasi-zero automata and a more efficient algorithm. Let $\mathcal{A} = (Q, A, \cdot, q_0, F)$ be an automaton. The Nerode equivalence $\sim$ of $\mathcal{A}$ is the relation defined on $Q$ by $p \sim q$ if and only if $\mathrm{Fut}(p) = \mathrm{Fut}(q)$. One can easily verify that $\sim$ is actually a congruence, in the sense that $F$ is saturated by $\sim$ and $p \sim q$ implies $p \cdot w \sim q \cdot w$ for all $w \in A^*$. Hence it follows that there is a well defined new automaton $\mathcal{A}/{\sim}$, the quotient automaton of $\mathcal{A}$:

$$\mathcal{A}/{\sim} = (Q/{\sim}, A, \cdot, [q_0]_\sim, F/{\sim})$$

where $[q]_\sim$ is the equivalence class modulo $\sim$ of $q$, $S/{\sim} = \{[q]_\sim \mid q \in S\}$ is the set of the equivalence classes modulo $\sim$ of a subset $S \subseteq Q$, and where the transition function $\cdot : Q/{\sim} \times A \to Q/{\sim}$ is defined by $[p]_\sim \cdot a = [p \cdot a]_\sim$. We define the natural mapping $\phi_\sim : Q \to Q/{\sim}$ by $\phi_\sim(q) = [q]_\sim$. Condition (M) for minimal automata implies that, for any automaton $\mathcal{A}$, its quotient automaton $\mathcal{A}/{\sim}$ is the minimal automaton of $L(\mathcal{A})$. We shall identify the quotient automaton $\mathcal{A}/{\sim}$ with the minimal automaton $\mathcal{A}_{L(\mathcal{A})}$ (cf. [?]).

We now introduce a new class of automata which is a generalisation of the class of zero automata.

Definition 4 (quasi-zero automaton). An automaton $\mathcal{A} = (Q, A, \cdot, q_0, F)$ is quasi-zero if either $\bigcup \mathrm{Sink}(\mathcal{A}) \subseteq F$ or $\bigcup \mathrm{Sink}(\mathcal{A}) \cap F = \emptyset$ holds.

Since every zero automaton satisfies $\bigcup \mathrm{Sink}(\mathcal{A}) = \{p\}$ for a certain state $p$ (Lemma 2), every zero automaton is quasi-zero. The following proposition shows that the minimal automaton of any quasi-zero automaton is zero and vice versa (this justifies the term “quasi-zero”).

Proposition 2. An automaton $\mathcal{A} = (Q, A, \cdot, q_0, F)$ is quasi-zero if and only if $\mathcal{A}/{\sim}$ is zero.

Proof. This proposition shows exactly the equivalence (1) ⇔ (4) in Theorem 1.


(1) ⇒ (4) ($\mathcal{A}/{\sim}$ is zero ⇒ $\mathcal{A}$ is quasi-zero). Let $p$ be the unique sink state of $\mathcal{A}/{\sim}$. To prove this direction, it is enough to consider the case when $p \in F/{\sim}$, i.e., $\mathrm{Fut}(p) = A^*$; the other case is symmetric. We now show

$$\bigcup \mathrm{Sink}(\mathcal{A}) \subseteq F \qquad (5)$$

by contradiction. Let us assume that Inclusion (5) does not hold, that is, we assume there exists a non-final state $q$ in $\bigcup \mathrm{Sink}(\mathcal{A})$. Let $P$ be the strongly connected sink component of $\mathcal{A}$ that contains $q$. Since $P$ is sink and strongly connected, $\phi_\sim(P)$ is sink and strongly connected in $\mathcal{A}/{\sim}$ too. Moreover, $\phi_\sim(P)$ does not contain the sink state $p$, because $q \notin F$ implies that, for any state $q'$ in $P$, $\mathrm{Fut}(q') \neq A^*$, from which we obtain $\mathrm{Fut}([q']_\sim) \neq \mathrm{Fut}(p)$ and $[q']_\sim \neq p$. That is, $\mathcal{A}/{\sim}$ has at least two strongly connected sink components, $\phi_\sim(P)$ and $p$. This is a contradiction.


(4) ⇒ (1) ($\mathcal{A}$ is quasi-zero ⇒ $\mathcal{A}/{\sim}$ is zero). To prove this direction, it is enough to consider the case when $\bigcup \mathrm{Sink}(\mathcal{A}) \subseteq F$. Since $\mathcal{A}$ is quasi-zero, all states in $\bigcup \mathrm{Sink}(\mathcal{A})$ have the same future $A^*$, i.e., $\mathrm{Fut}(q) = A^*$ for every state $q$ in $\bigcup \mathrm{Sink}(\mathcal{A})$, because $\bigcup \mathrm{Sink}(\mathcal{A}) \subseteq F$ implies $q \cdot w \in F$ for every state $q$ in $\bigcup \mathrm{Sink}(\mathcal{A})$ and every word $w$ in $A^*$. This implies that $\bigcup \mathrm{Sink}(\mathcal{A})/{\sim}$ consists of a single equivalence class, say $p$. Moreover, this equivalence class $p$ is a sink state in $\mathcal{A}/{\sim}$ by the definition of sink and Condition (M) of the minimality of $\mathcal{A}/{\sim}$. We now show, by contradiction, that $\mathcal{A}/{\sim}$ has only one strongly connected sink component $p$:

$$\bigcup \mathrm{Sink}(\mathcal{A}/{\sim}) = \{p\} \qquad (6)$$

from which we obtain that $\mathcal{A}/{\sim}$ is zero by Lemma 2. Let us assume that Equation (6) does not hold, that is, we assume there exists a strongly connected sink component $R = \{r_1, \ldots, r_n\}$ of $\mathcal{A}/{\sim}$ which does not contain $p$. Recall that each state $r_i$ of $\mathcal{A}/{\sim}$ is an equivalence class, i.e., a set of states, of $\mathcal{A}$. Let $S = \phi_\sim^{-1}(R)$ be a set of states of $\mathcal{A}$. Since $R$ is a strongly connected sink component of $\mathcal{A}/{\sim}$, its preimage $S$ contains at least one strongly connected sink component, say $P$, of $\mathcal{A}$. For every state $q$ in $P$, $\mathrm{Fut}(q)$ is not equal to $A^* = \mathrm{Fut}(p)$, because $p \notin R$ and $\phi_\sim(P) \subseteq \phi_\sim(S) = R$ imply $[q]_\sim \neq p$. This contradicts the assumption that $\mathrm{Fut}(q) = A^*$ for every state $q$ in $\bigcup \mathrm{Sink}(\mathcal{A})$. This completes the proof of Theorem 1. □

By using this proposition, we obtain a linear time algorithm by avoiding minimisation, as stated in the following theorem.

Theorem 2. There is an $O(n)$ algorithm for testing whether a given regular language is zero-one, if it is given by an $n$-state deterministic finite automaton.

Proof. For a given $n$-state automaton $\mathcal{A}$, we can determine whether $L(\mathcal{A})$ obeys the zero-one law by the following steps: (i) Calculate the family $\mathcal{P}$ of all strongly connected components of $\mathcal{A}$. (ii) Extract all strongly connected sink components from $\mathcal{P}$ to obtain $\mathrm{Sink}(\mathcal{A})$. (iii) Check whether, in $\bigcup \mathrm{Sink}(\mathcal{A})$, either all states are final or all states are non-final, i.e., whether $\mathcal{A}$ is quasi-zero. By Theorem 1 and Proposition 2, $L(\mathcal{A})$ obeys the zero-one law if and only if $\mathcal{A}$ is quasi-zero. Hence this algorithm is correct. All steps (i) to (iii) can be done in $O(n)$, which ends the proof. □
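Steps (i) to (iii) can be sketched as follows. This is an illustrative implementation under assumptions, not the author's code: it uses Kosaraju's two-pass algorithm instead of Tarjan's (both are linear in the number of edges), it first discards states unreachable from the initial state because the text assumes accessible automata, and all names and example automata are hypothetical.

```python
def sccs(nodes, succ):
    """Kosaraju's algorithm (iterative); returns a dict node -> component representative."""
    order, visited = [], set()
    for root in nodes:
        if root in visited:
            continue
        visited.add(root)
        stack = [(root, iter(succ[root]))]
        while stack:
            node, it = stack[-1]
            for nxt in it:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(succ[nxt])))
                    break
            else:  # all successors explored: node is finished
                order.append(node)
                stack.pop()
    pred = {v: [] for v in nodes}
    for v in nodes:
        for w in succ[v]:
            pred[w].append(v)
    comp = {}
    for root in reversed(order):  # second pass on the reversed graph
        if root in comp:
            continue
        comp[root] = root
        stack = [root]
        while stack:
            v = stack.pop()
            for u in pred[v]:
                if u not in comp:
                    comp[u] = root
                    stack.append(u)
    return comp

def is_quasi_zero(alphabet, delta, q0, finals):
    """True iff the states of all sink components are all final or all non-final."""
    reachable, todo = {q0}, [q0]
    while todo:  # keep only the accessible part
        q = todo.pop()
        for a in alphabet:
            r = delta[(q, a)]
            if r not in reachable:
                reachable.add(r)
                todo.append(r)
    succ = {q: [delta[(q, a)] for a in alphabet] for q in reachable}
    comp = sccs(list(reachable), succ)
    # a component is a sink iff no transition leaves it
    non_sink = {comp[q] for q in reachable for r in succ[q] if comp[r] != comp[q]}
    sink_states = [q for q in reachable if comp[q] not in non_sink]
    return len({q in finals for q in sink_states}) == 1

# Hypothetical examples over {a, b}.
STARTS_WITH_A = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 1, (1, "b"): 1,
                 (2, "a"): 2, (2, "b"): 2}            # aA*, finals {1}
EVEN_LENGTH = {(0, "a"): 1, (0, "b"): 1, (1, "a"): 0, (1, "b"): 0}  # (AA)*, finals {0}
CONTAINS_A_DUP = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 2, (1, "b"): 1,
                  (2, "a"): 1, (2, "b"): 2}           # non-minimal A*aA*, finals {1, 2}

if __name__ == "__main__":
    print(is_quasi_zero("ab", STARTS_WITH_A, 0, {1}))
    print(is_quasi_zero("ab", EVEN_LENGTH, 0, {0}))
    print(is_quasi_zero("ab", CONTAINS_A_DUP, 0, {1, 2}))
```

The third automaton is deliberately not minimal: its sink component has two states, both final, so the test succeeds without any minimisation step.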


7 Logical aspects of the zero-one law 

There are different manners to define a language: a set of finite words. In the descriptive approach, the words of a language are characterised by a property. The automata approach is a special case of the descriptive approach. Another variant of the descriptive approach consists in defining languages by logical formulae: we regard words as finite structures with a linear order composed of a sequence of positions labelled over a finite alphabet. The zero-one law, which is defined in this paper, has been studied extensively in finite model theory (cf. Chapter 12 “Zero-One Laws” of [14]). This notion can be applied to logics over not only finite words, but also arbitrary finite structures, such as finite graphs: we regard graphs as finite structures with a set of nodes and their edge relation. We say that a logic $\mathcal{L}$, over fixed finite structures, has the zero-one law if every property $\Phi$ definable in $\mathcal{L}$ satisfies $\mu(\Phi) \in \{0, 1\}$ ($\mu$ is defined analogously). Broadly speaking, every property $\Phi$ is either almost surely true or almost surely false. Fagin’s theorem [9] states that first-order logic FO for finite graphs has the zero-one law. Moreover, an FO sentence $\Phi$ is almost surely true (i.e., $\mu(\Phi) = 1$) if and only if $\Phi$ is true on a certain infinite graph: the random graph. This characterisation leads to the fact that, for any FO sentence $\Phi$, it is decidable whether $\mu(\Phi) = 1$ (cf. Corollary 12.11 in [14]). After the work of Fagin, much ink has been spent on the zero-one law for logics over finite graphs. It is now known that many logics (e.g., logic with a fixed point operator, finite variable infinitary logic [12] and certain fragments of second-order logic [13]) have the zero-one law.

By contrast, though many logics have the zero-one law, their extensions with ordering (such as logics over finite words) no longer have it. In fact, over both finite graphs and finite words, while first-order logic FO has the zero-one law, its extension with a linear order FO[<] does not.

Figure 3: Logical and algebraic characterisations of well known subclasses of regular languages.

Example 3. A simple counterexample is the language $aA^*$, which can be defined by the FO[<] sentence
$\Phi_{aA^*} = \exists i\,(\forall j\,(i \le j) \land P_a(i))$. The variables $i$ and $j$ of this sentence represent positions in a word. The
predicate $P_a(i)$ is interpreted to mean “the $i$-th letter is $a$”. This language $aA^*$ satisfies $\mu_n(aA^*) = 1/|A|$
for every $n \ge 1$, as we stated in Section 1, hence $\Phi_{aA^*}$ does not obey the zero-one law in general. It follows
that FO[<] for finite words does not have the zero-one law.
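The value of the probability function for $aA^*$ can be checked by brute force. The following is a minimal sketch, not taken from the paper: the helper names `mu_n` and `starts_with_a` are hypothetical, and the enumeration is exponential in `n`, so it is only meant for small lengths.

```python
from fractions import Fraction
from itertools import product

def mu_n(in_language, alphabet, n):
    """Fraction of the words of length n over `alphabet` that satisfy `in_language`."""
    words = ["".join(w) for w in product(alphabet, repeat=n)]
    return Fraction(sum(1 for w in words if in_language(w)), len(words))

def starts_with_a(word):
    # Membership test for aA*: the word is non-empty and its first letter is 'a'.
    return word[:1] == "a"

for n in range(1, 6):
    # Stays at 1/2 over {a, b} and at 1/3 over {a, b, c}, so the limit is neither 0 nor 1.
    print(n, mu_n(starts_with_a, "ab", n), mu_n(starts_with_a, "abc", n))
```

Over a one-letter alphabet the same function returns 1 for every `n >= 1`, which is why the failure of the zero-one law only holds "in general".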

We summarise well known logical and algebraic characterisations of classes of languages, including
the class of zero-one languages, in Figure 3. Details and full proofs of these results can be found
in a very nice survey [6] by Diekert et al. In Figure 3 we use standard abridged notation: $\mathrm{FO}^n[<]$ for
first-order logic with $n$ variables; $\Sigma_n[<]$ for FO formulae with $n$ blocks of quantifiers and starting with
a block of existential quantifiers; $\mathbb{B}\Sigma_n[<]$ for the Boolean closure of $\Sigma_n[<]$. A monomial over $A$ is a
language of the form $A_0^* a_1 A_1^* a_2 \cdots a_k A_k^*$ where $a_i \in A$ and $A_i \subseteq A$ for each $i$, and is unambiguous if for
all $w \in A_0^* a_1 A_1^* a_2 \cdots a_k A_k^*$ there exists exactly one factorisation $w = w_0 a_1 w_1 a_2 \cdots a_k w_k$ with $w_i \in A_i^*$ for
each $i$. A language $L$ over $A$ is called:

• star-free if it is expressible by union, concatenation and complement, but does not use Kleene star; 

• polynomial if it is a finite union of monomials; 

• unambiguous polynomial if it is a finite disjoint union of unambiguous monomials; 

• piecewise testable if it is a finite Boolean combination of simple polynomials; 

• simple polynomial if it is a finite union of languages of the form $A^* a_1 A^* a_2 \cdots a_k A^*$.
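The definition of an unambiguous monomial can be made concrete by counting factorisations. The sketch below is illustrative only: the function names are hypothetical, and the bounded check `is_unambiguous_up_to` gives evidence up to a chosen length rather than a proof of unambiguity.

```python
from functools import lru_cache
from itertools import product

def count_factorisations(word, blocks, letters):
    """Count factorisations word = w0 a1 w1 ... ak wk with each wi in blocks[i]*.

    blocks is the list [A_0, ..., A_k] of letter sets, letters is [a_1, ..., a_k].
    """
    n, k = len(word), len(letters)

    @lru_cache(maxsize=None)
    def go(i, pos):
        # Number of ways to write word[pos:] as w_i a_{i+1} w_{i+1} ... a_k w_k.
        total = 0
        end = pos  # candidate split: w_i = word[pos:end]
        while True:
            if i == k:
                total += (end == n)          # the last block must reach the end
            elif end < n and word[end] == letters[i]:
                total += go(i + 1, end + 1)  # read a_{i+1}, continue with the next block
            if end == n or word[end] not in blocks[i]:
                break                        # w_i cannot be extended any further
            end += 1
        return total

    return go(0, 0)

def is_unambiguous_up_to(blocks, letters, alphabet, max_len):
    """Bounded check: no word of length <= max_len has two factorisations."""
    return all(
        count_factorisations("".join(w), blocks, letters) <= 1
        for n in range(max_len + 1)
        for w in product(alphabet, repeat=n)
    )

A = {"a", "b"}
print(count_factorisations("aa", [A, A], ["a"]))      # monomial A* a A*
print(count_factorisations("aa", [{"b"}, A], ["a"]))  # monomial {b}* a A*
```

The word `aa` has two factorisations in $A^* a A^*$ (either occurrence of $a$ can be the marked letter) but only one in $\{b\}^* a A^*$, where the marked letter must be the first $a$. Both monomials denote the same language, so ambiguity is a property of the expression rather than of the language.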

The question then arises as to which fragments of FO[<] over finite words have the zero-one law.
The algebraic characterisation of the zero-one law partially answers this question. Since every $\mathcal{J}$-trivial
syntactic monoid has a zero element (cf. ifT^ l), Theorem 1 leads to the following corollary.

Corollary 1. The Boolean closure of existential first-order logic over finite words has the zero-one law. 

One can easily verify that the sentence $\Phi_{aA^*}$ in Example 3, which only uses two variables $i$ and $j$, is in
$\mathrm{FO}^2[<]$. It follows that $\mathrm{FO}^2[<]$ does not have the zero-one law, hence Corollary 1 shows us a “separation
line” (red line in Figure 3). It must be noted that the class of zero-one languages and the class of unambiguous
polynomials are incomparable. To take a simple example, consider two languages $(aa)^*$ and $aA^*$ over
$A = \{a,b\}$. The language $(aa)^*$ is zero-one but not an unambiguous polynomial, since its syntactic monoid
is not aperiodic (i.e., having no nontrivial subgroup). Conversely, $aA^*$ is not zero-one but is an unambiguous
polynomial, since it is definable in $\mathrm{FO}^2[<]$ as we have stated in Example 3. An interesting open problem
is whether there exists a logical fragment that exactly captures the zero-one law.
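The two languages above can be compared mechanically by computing the transition monoid of a minimal complete DFA (which is isomorphic to the syntactic monoid) and searching for a zero element. This is a brute-force sketch with hypothetical names and hand-written minimal DFAs; it is not the linear time algorithm mentioned in the abstract.

```python
def compose(f, g):
    """Transformation 'first f, then g' on states 0..n-1 (tuples map state -> state)."""
    return tuple(g[q] for q in f)

def transition_monoid(letter_actions):
    """All transformations induced by words, generated from the letter actions."""
    n = len(letter_actions[0])
    identity = tuple(range(n))          # action of the empty word
    elements, frontier = {identity}, [identity]
    while frontier:
        f = frontier.pop()
        for g in letter_actions:
            h = compose(f, g)           # extend the word by one more letter
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def zero_element(monoid):
    """Return z with z*x == x*z == z for every x, or None if there is no zero."""
    for z in monoid:
        if all(compose(z, x) == z and compose(x, z) == z for x in monoid):
            return z
    return None

def is_aperiodic(monoid):
    """True iff every element x satisfies x^m == x^(m+1) for some m."""
    for x in monoid:
        powers, p = [], x
        while p not in powers:          # iterate x, x^2, ... until a power repeats
            powers.append(p)
            p = compose(p, x)
        if p != powers[-1]:             # the cycle of powers has length > 1
            return False
    return True

# Minimal DFAs over {a, b}; each tuple is the action of one letter (a first, then b).
AA_STAR = [(1, 0, 2), (2, 2, 2)]  # (aa)*: 0 = even (accepting), 1 = odd, 2 = sink
A_ASTAR = [(1, 1, 2), (2, 1, 2)]  # aA*: 0 = start, 1 = accepting sink, 2 = rejecting sink

for name, dfa in [("(aa)*", AA_STAR), ("aA*", A_ASTAR)]:
    m = transition_monoid(dfa)
    print(name, "zero:", zero_element(m), "aperiodic:", is_aperiodic(m))
```

For $(aa)^*$ the constant map to the sink state is a zero and the action of $a$ generates a two-element group, while for $aA^*$ the monoid is aperiodic but the actions of $a$ and $b$ are both left zeros, so no two-sided zero exists.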
8 Related works

The notion of the probability $\mu_n$ for regular languages has been studied by Berstel [1] from 1973, and by
Salomaa and Soittola [19] from 1978, in the context of the theory of formal power series. They proved that
$\mu_n(L)$ has finitely many accumulation points and each accumulation point is rational. Another approach,
based on Markov chain theory, was presented by Bodirsky et al. Q. They investigated the algorithmic
complexity of computing the accumulation points of $\mu_n(L)$ and introduced an O(n^) algorithm to compute $\mu(L)$
for any regular language $L$ (and hence to decide whether $L$ is zero-one), if $L$ is given by an $n$-state deterministic
finite automaton.
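The sequence $\mu_n(L)$ can be tabulated exactly from a complete DFA by counting, for each length, how many words lead to each state. The sketch below uses hypothetical names and plain dynamic programming; it is neither the Markov chain algorithm of Bodirsky et al. nor the method of this paper, and it only exhibits finitely many terms of the sequence.

```python
from fractions import Fraction

def mu_sequence(delta, start, accepting, alphabet, max_n):
    """Return [mu_0(L), ..., mu_max_n(L)] for the language L of a complete DFA.

    delta maps (state, letter) to the next state.
    """
    counts = {start: 1}  # counts[q] = number of words of the current length reaching q
    result = []
    for n in range(max_n + 1):
        accepted = sum(c for q, c in counts.items() if q in accepting)
        result.append(Fraction(accepted, len(alphabet) ** n))
        nxt = {}
        for q, c in counts.items():
            for a in alphabet:
                r = delta[q, a]
                nxt[r] = nxt.get(r, 0) + c   # every word reaching q extends by a
        counts = nxt
    return result

# (AA)* over {a, b}: words of even length; state 0 = even (accepting), 1 = odd.
EVEN = {(0, "a"): 1, (0, "b"): 1, (1, "a"): 0, (1, "b"): 0}
# (aa)* over {a, b}: state 0 = even number of a's so far (accepting), 1 = odd, 2 = sink.
AA = {(0, "a"): 1, (1, "a"): 0, (2, "a"): 2, (0, "b"): 2, (1, "b"): 2, (2, "b"): 2}

print(mu_sequence(EVEN, 0, {0}, "ab", 5))
print(mu_sequence(AA, 0, {0}, "ab", 4))
```

The first sequence alternates between 1 and 0, so it has the two rational accumulation points 0 and 1 and no limit; the second tends to 0, in line with $(aa)^*$ being zero-one.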

A similar notion, the density of a language, has also been studied in algebraic coding theory (cf. mmy).
A probability distribution $\pi$ on $A^*$ is a function $\pi : A^* \to [0,1]$ such that $\pi(\varepsilon) = 1$ and
$\sum_{a \in A} \pi(wa) = \pi(w)$ for all $w$ in $A^*$. As a particular case, a Bernoulli distribution is a morphism $\pi$ from $A^*$ into $[0,1]$
such that $\sum_{a \in A} \pi(a) = 1$. Clearly, a Bernoulli distribution is a probability distribution. We denote by
$A^{(n)} = A^0 \cup A^1 \cup \cdots \cup A^{n-1}$ the set of all words of length less than $n$ over a finite alphabet $A$. The density
$\delta(L)$ of $L$ is a limit defined by

$$\delta(L) = \lim_{n \to \infty} \frac{1}{n}\, \pi(L \cap A^{(n)})$$

where $\pi$ is a probability distribution on $A^*$. A monoid $M$ is called well founded if it has a unique minimal
ideal, if moreover this ideal is the union of the minimal left ideals of $M$, and also of the minimal right
ideals, and if the intersection of a minimal right ideal and of a minimal left ideal is a finite group. An
elementary result from analysis shows that if the sequence $\pi(L \cap A^n)$ has a limit, then $\delta(L)$ also has a
limit, and both are equal. The converse, however, does not hold (e.g., $\delta((AA)^*) = 1/2$, although
$\pi((AA)^* \cap A^n)$ alternates between 1 and 0). In their book [5],
Berstel et al. proved Theorem 13.4.5 which states that, for any well founded monoid $M$ and morphism
$\varphi : A^* \to M$, the density $\delta(\varphi^{-1}(m))$ exists for every $m$ in $M$. Furthermore, this density is non-zero if and only
if $m$ is in the minimal ideal $K$ of $M$, from which we obtain $\delta(\varphi^{-1}(K)) = 1$. Since every monoid with
zero is well founded, Theorem 13.4.5 implies that every language with zero is zero-one (i.e., (Jj) $\Rightarrow$ (gl), the
“easy part” of our Theorem 1). Some other related results can be found in the theory of probabilities on
algebraic structures initiated by Grenander [11] and Martin-Löf [15].
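The value of the density of $(AA)^*$ quoted above can be spelled out as a short computation. It assumes the reading $\delta(L) = \lim_{n\to\infty} \frac{1}{n}\pi(L \cap A^{(n)})$ and that $\pi$ is extended to sets of words by summation, $\pi(X) = \sum_{w \in X} \pi(w)$; the latter convention is an assumption made here for the illustration.

```latex
% Every probability distribution gives total weight 1 to each length:
% \pi(A^0) = \pi(\varepsilon) = 1 and
% \pi(A^{k+1}) = \sum_{w \in A^k} \sum_{a \in A} \pi(wa) = \sum_{w \in A^k} \pi(w) = \pi(A^k).
\[
  \pi\bigl((AA)^* \cap A^k\bigr) =
  \begin{cases} 1 & \text{if } k \text{ is even,} \\ 0 & \text{if } k \text{ is odd,} \end{cases}
  \qquad
  \pi\bigl((AA)^* \cap A^{(n)}\bigr)
  = \sum_{k=0}^{n-1} \pi\bigl((AA)^* \cap A^k\bigr)
  = \lceil n/2 \rceil .
\]
\[
  \delta\bigl((AA)^*\bigr) = \lim_{n \to \infty} \frac{\lceil n/2 \rceil}{n} = \frac{1}{2} .
\]
```

The averaged quantity converges even though the sequence $\pi((AA)^* \cap A^n)$ itself oscillates, which is the sense in which the density is a weaker notion than the asymptotic probability.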

The point to observe is that the techniques presented in this paper are purely automata theoretic. We
did not use any probability theoretic tools, such as measure theory, formal power series, Markov chains,
or algebraic coding theory. This point deserves explicit emphasis.